Boosting Speaker Recognition Performance with Compact Representations
Abstract
This paper describes a speaker recognition system combination approach in which compact forms of MAP-adapted GMM supervectors are used to boost the performance of a high-dimensional supervector-based system or a combination of multiple systems. The compact supervector representations are subjected to a diagonal transformation that emphasizes the dimensions carrying significant speaker information and de-emphasizes noisy dimensions. Scores obtained from these representations are then combined with the scores obtained from high-dimensional supervector representations. The transformation parameters and the combination weights are estimated by minimizing a discriminative training objective function that approximates a minimum detection cost function. We carried out experiments on two NIST 2008 Speaker Recognition Evaluation English telephony tasks to compare the proposed approach with direct combination of scores obtained from low- and high-dimensional supervector representations. We found that the proposed approach yields up to 18% relative gain.
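The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, the form of the objective, or any parameter values, so everything below is an assumption made for illustration: the inner-product scoring, the sigmoid-smoothed detection cost, the function names, and the default cost parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def compact_score(d, enroll, test):
    """Score two compact supervectors after a diagonal transformation.

    Assumption: the score is an inner product of the two vectors after each
    is scaled element-wise by d (i.e. multiplied by diag(d)). Large |d[i]|
    emphasizes dimension i; d[i] near zero suppresses it.
    """
    d = np.asarray(d, dtype=float)
    enroll = np.asarray(enroll, dtype=float)
    test = np.asarray(test, dtype=float)
    return float(np.dot(d * enroll, d * test))


def combined_score(high_dim_score, d, enroll, test, w_high, w_compact):
    """Linear combination of a high-dimensional system score and the compact score.

    Assumption: the combination is a weighted sum with scalar weights.
    """
    return w_high * high_dim_score + w_compact * compact_score(d, enroll, test)


def smoothed_detection_cost(scores, is_target, threshold=0.0, alpha=10.0,
                            c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Differentiable stand-in for a detection cost function.

    Assumption: hard accept/reject decisions are replaced by a sigmoid so
    that the cost can be minimized with gradient-based methods. The cost
    parameters are illustrative defaults, not values stated in the abstract.
    """
    scores = np.asarray(scores, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    # Soft decision: close to 1 when the score is well above the threshold.
    soft_accept = 1.0 / (1.0 + np.exp(-alpha * (scores - threshold)))
    p_miss = np.mean(1.0 - soft_accept[is_target])   # soft miss rate
    p_fa = np.mean(soft_accept[~is_target])          # soft false-alarm rate
    return c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
```

In such a setup, the entries of `d` and the weights `w_high` and `w_compact` would be the trainable parameters, adjusted on a development set so that the smoothed cost of the combined scores decreases.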
Similar resources
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Speaker Recognition Using Adaptively Boosted Decision Tree Classifier (Accepted Version)
In this paper, a novel approach for speaker recognition is proposed. The approach makes use of adaptive boosting (AdaBoost) and C4.5 decision trees for closed-set, text-dependent speaker recognition. A subset of 20 speakers, 10 male and 10 female, drawn from the YOHO speaker verification corpus is used to assess the performance of the system. Results reveal that an accuracy of 99.5% of speaker ...
Boosting Localized Features for Speaker and Speech Recognition
In this thesis, we propose a novel approach for speaker and speech recognition involving localized, binary, data-driven features. The proposed approach is largely inspired by similar localized approaches in the computer vision domain. The success of these existing approaches coupled with their proven advantages of robustness and computational efficiency motivated us to apply these ideas to the ...
Text-dependent speaker recognition by efficient capture of speaker dynamics in compressed time-frequency representations of speech
Prevalent speaker recognition methods use only spectral-envelope-based features such as MFCC, ignoring the rich speaker identity information contained in the temporal-spectral dynamics of the entire speech signal. We propose a new feature called compressed spectral dynamics or CSD for speaker recognition based on a compressed time-frequency representation of spoken passwords which effectively ca...
Publication date: 2011